We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the recently popularized centralized training with decentralized execution (CTDE) regime. VDN and QMIX are representative examples that factorize the joint action-value function into individual ones for decentralized execution. However, VDN and QMIX address only a fraction of factorizable MARL tasks because of the structural constraints they impose on the factorization, such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions. QTRAN guarantees more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than previous methods do. Our experiments on multi-domain Gaussian-squeeze and modified predator-prey tasks demonstrate QTRAN's superior performance, with especially large margins in games whose payoffs penalize non-cooperative behavior more aggressively.
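The additivity constraint mentioned in this abstract can be illustrated with a small NumPy sketch. It is not the QTRAN, VDN, or QMIX implementation: every function name is hypothetical, the 3x3 payoff matrix is an illustrative assumption that does not come from the abstract, and the least-squares additive fit is only a stand-in for what a trained additive model might converge to under uniformly sampled joint actions.

```python
import numpy as np

def joint_q_additive(q_agents):
    """Build Q_jt[u_1, ..., u_n] = sum_i Q_i[u_i] (additive factorization)."""
    n = len(q_agents)
    joint = np.zeros([len(q) for q in q_agents])
    for i, q in enumerate(q_agents):
        shape = [1] * n
        shape[i] = len(q)
        # Broadcast agent i's utilities along its own axis of the joint table.
        joint = joint + np.asarray(q, dtype=float).reshape(shape)
    return joint

def decentralized_greedy(q_agents):
    """Each agent picks its own argmax without seeing the others."""
    return tuple(int(np.argmax(q)) for q in q_agents)

def centralized_greedy(joint):
    """Argmax over the full joint action table."""
    return tuple(int(i) for i in np.unravel_index(np.argmax(joint), joint.shape))

def additive_least_squares_fit(payoff):
    """Best additive approximation of a 2-agent payoff table (uniform weights)."""
    q1 = payoff.mean(axis=1)
    q2 = payoff.mean(axis=0) - payoff.mean()
    return q1, q2

# Case 1: a joint function that really is additive.
# Decentralized and centralized greedy actions agree.
q_agents = [np.array([1.0, 3.0, 2.0]), np.array([0.5, -1.0])]
joint = joint_q_additive(q_agents)

# Case 2 (hypothetical payoff): miscoordination is penalized heavily,
# so the table is not additive and the additive fit misses the optimum.
payoff = np.array([[8.0, -12.0, -12.0],
                   [-12.0, 0.0, 0.0],
                   [-12.0, 0.0, 0.0]])
q1, q2 = additive_least_squares_fit(payoff)
fitted_choice = decentralized_greedy([q1, q2])
true_optimum = centralized_greedy(payoff)
```

In the first case the per-agent argmax recovers the joint argmax, which is what makes decentralized execution possible. In the second case the additive fit steers both agents away from the jointly optimal action, which is the kind of task the abstract says additive or monotonic factorizations cannot cover.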
translated by Google Translate
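Inspired by digital twin systems, we developed a novel real-time digital double framework to enhance a robot's perception of terrain conditions. Built on the same physical model and motion control, this work uses a simulated digital double synchronized with the real robot to capture and extract the discrepancy between the two systems, which provides high-dimensional cues across multiple physical quantities that represent the differences between the modelled and the real world. Soft, non-rigid terrain is a common cause of failure in legged locomotion, and visual perception alone is insufficient for estimating such physical properties of the terrain. We used the digital double to develop an estimate of collapsibility, which addresses this issue through physical interaction during dynamic walking. The discrepancy between the sensory measurements of the real robot and those of its digital double is used as the input to a learning-based algorithm for terrain collapsibility analysis. Although trained only in simulation, the learned model successfully performs collapsibility estimation in both simulation and the real world. Our evaluation of the results shows generalization to different scenarios and the advantage of the digital double in reliably detecting nuances in ground conditions.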
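There is an urgent market demand to minimize the test time of Prompt Gamma Neutron Activation Analysis (PGNAA) spectra measurement machines so that they can serve as instant material analyzers, for example to classify waste samples immediately and determine the best recycling method from the detected composition of the test sample. This article introduces a new development in deep learning classification aimed at reducing the test time of PGNAA machines. We propose a Random Sampling Method (RSM) and Class Activation Maps (CAM) to generate "downsized" samples and train the CNN model continuously. RSM aims to reduce the measurement time within a sample, while CAM is used to filter out the less important energy ranges of the downsized samples. We shorten the overall PGNAA measurement time to 2.5 seconds while keeping the accuracy at around 96.88% on our dataset, which contains 12 different species of substances. Compared with classifying different species of materials, substances that share the same elements need more test time (sample count rate) to achieve good accuracy. For example, classifying copper alloys requires nearly 24 seconds of test time to reach 98% accuracy.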
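With the increasing need for multiple robots to explore unknown regions in challenging environments, efficient collaborative exploration strategies are required. Frontier-based Rapidly-Exploring Random Tree (RRT) exploration can be deployed to explore an unknown environment. However, its greedy behavior causes multiple robots to explore the region with the highest revenue, which leads to massive overlap during exploration. To address this issue, we present a temporal memory-based RRT (TM-RRT) exploration strategy that lets multiple robots perform robust exploration of an unknown environment. It computes an adaptive duration for each assigned frontier and calculates each frontier's revenue based on the relative position of each robot. In addition, each robot is equipped with a memory of assigned frontiers that is shared across the fleet, which prevents the same frontier from being assigned repeatedly. Through both simulation and actual deployment, we demonstrate the robustness of the TM-RRT exploration strategy by completing exploration of a 25.0 m x 54.0 m (1350.0 m²) area, where the conventional RRT exploration strategy falls short.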
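Deep neural networks with multilevel connections process input data in complex ways to learn the information it contains. A network's learning efficiency depends not only on the complexity of the neural network architecture but also on the input training images. Medical image segmentation with deep neural networks, for skull stripping or tumor segmentation from magnetic resonance (MR) images, enables learning both global and local features of the images. Although medical images are collected in a controlled environment, there may be artifacts or equipment-based variance that cause inherent bias in the input set. In this study, we investigate the correlation between the image quality metrics (IQM) of MR images and the segmentation accuracy of the neural network. We used a 3D DenseNet architecture and trained the network on the same input while applying different methods for selecting the training dataset based on IQM values. The difference in segmentation accuracy between models trained on randomly selected inputs and models trained on IQM-selected inputs sheds light on the role of image quality metrics in segmentation accuracy. Using image quality metrics to choose the training inputs may allow the learning efficiency and segmentation accuracy of the network to be tuned further.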
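It is well known that vision classification models suffer from poor calibration under data distribution shift. In this paper, we take a geometric approach to this problem. We propose Geometric Sensitivity Decomposition (GSD), which decomposes the norm of a sample's feature embedding and its angular similarity to a target classifier into an instance-dependent component and an instance-independent component. The instance-dependent component captures sensitive information about changes in the input, while the instance-independent component represents insensitive information that serves solely to minimize the loss on the training dataset. Inspired by this decomposition, we analyze a simple extension to current softmax-linear models that learns to disentangle the two components during training. On several common vision models, the disentangled model outperforms other calibration methods on standard calibration metrics in the face of out-of-distribution (OOD) data and corruption, with significantly less complexity. Specifically, we surpass the current state of the art by a 30.8% relative improvement in Expected Calibration Error on corrupted CIFAR100. Code is available at https://github.com/gt-ripl/geometric-sentivity-decomposition.git.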
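For readers unfamiliar with the Expected Calibration Error mentioned in this abstract, the following is a minimal sketch of the commonly used equal-width binned estimator. It is not taken from the paper's code: the function name, the choice of 10 bins, and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - confidence| over bins.

    confidences: predicted probability of the chosen class, in (0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0 is not counted.
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            accuracy = correct[in_bin].mean()
            avg_confidence = confidences[in_bin].mean()
            # in_bin.mean() is the fraction of all samples that fall in this bin.
            ece += in_bin.mean() * abs(accuracy - avg_confidence)
    return float(ece)

# Toy example (made-up values): an overconfident model that reports 95%
# confidence on four predictions but gets only three of them right.
toy_ece = expected_calibration_error([0.95, 0.95, 0.95, 0.95], [1, 1, 1, 0])
```

A perfectly calibrated model would have an ECE of zero. A relative improvement such as the one quoted in the abstract is measured against the ECE of the baseline method.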
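Uplift modeling is a causal learning technique that estimates subgroup-level treatment effects. It is commonly used in industry and elsewhere for tasks such as ad targeting. In a typical setting, an uplift model may take thousands of features as input, which is costly and leads to problems such as overfitting and poor model interpretability. Consequently, there is a need to select a subset of the features that matter most for modeling. However, traditional feature selection methods are ill-suited to this task because they are designed for standard machine learning models, whose target differs in important ways from that of uplift models. To address this, we introduce a set of feature selection methods designed specifically for uplift modeling, drawing inspiration from statistics and information theory. We evaluate the proposed methods empirically on publicly available datasets and demonstrate their advantages over traditional feature selection. We make the proposed methods publicly available as part of the CausalML open-source package.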
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes, from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19-task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct a detailed analysis of the main components that lead to high transfer performance.
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet they ignore the co-occurrence confounder of object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also creates a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and in addressing the dilemma between classification and localization performance.
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework, AnKGE, to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity level, relation level, and triple level. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights into the score function used for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
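The idea of interpolating analogy scores with a base model score can be sketched as a simple weighted combination. This is only an illustration of the general pattern and not AnKGE's actual score function: the abstract does not give the formula or how the adaptive weights are computed, so the function name, the fixed weights, and all numbers below are hypothetical.

```python
import numpy as np

def combine_scores(base_scores, analogy_scores, weights):
    """Add weighted per-level analogy scores to the base KGE scores.

    base_scores:    shape (n_candidates,), scores from the base KGE model.
    analogy_scores: shape (n_levels, n_candidates), one row per analogy
                    level (e.g. entity, relation, triple).
    weights:        shape (n_levels,), one weight per level; fixed here,
                    whereas the abstract describes them as adaptive.
    """
    base_scores = np.asarray(base_scores, dtype=float)
    analogy_scores = np.asarray(analogy_scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return base_scores + weights @ analogy_scores

# Made-up scores for three candidate tail entities.
base = [0.2, 0.5, 0.4]
analogy = [[0.9, 0.1, 0.3],   # entity-level analogy scores
           [0.8, 0.2, 0.1],   # relation-level analogy scores
           [0.7, 0.0, 0.2]]   # triple-level analogy scores
level_weights = [0.3, 0.2, 0.1]

combined = combine_scores(base, analogy, level_weights)
best_base = int(np.argmax(base))
best_combined = int(np.argmax(combined))
```

With these toy numbers the base model alone ranks the second candidate first, while the combined score promotes the first candidate, which shows how analogy evidence can change a link prediction ranking.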